1
00:00:11,353 --> 00:00:13,237
LiveTextAccess.

2
00:00:13,494 --> 00:00:16,821
Training for real-time intralingual subtitlers.

3
00:00:18,692 --> 00:00:22,948
This is Unit 1. Understanding accessibility.

4
00:00:23,098 --> 00:00:25,831
Element 1. Basic concepts.

5
00:00:27,345 --> 00:00:31,170
This video lecture focuses on multimodal communication,

6
00:00:31,270 --> 00:00:34,836
which is a specific feature of audiovisual translation

7
00:00:34,880 --> 00:00:37,000
and live situations.

8
00:00:37,178 --> 00:00:38,623
My name is Rocío Bernabé,

9
00:00:39,016 --> 00:00:42,526
from the Internationale Hochschule SDI München, in Germany.

10
00:00:43,295 --> 00:00:45,122
I have prepared this video lecture

11
00:00:45,225 --> 00:00:47,451
in collaboration with the European Federation

12
00:00:47,638 --> 00:00:50,667
of Hard of Hearing People, in short, EFHOH.

13
00:00:52,897 --> 00:00:56,438
On completion of this training sequence, you will be able to explain

14
00:00:56,557 --> 00:00:59,384
the concept of multimodal communication

15
00:00:59,472 --> 00:01:03,320
and to describe the challenges that real-time subtitlers

16
00:01:03,400 --> 00:01:05,258
and end-users face.

17
00:01:07,544 --> 00:01:09,290
Let's take a look at the agenda.

18
00:01:10,118 --> 00:01:14,118
We start by talking about what modes are

19
00:01:14,160 --> 00:01:15,720
in audiovisual translation

20
00:01:15,800 --> 00:01:19,478
and why communication is considered to be multimodal.

21
00:01:20,242 --> 00:01:22,912
Then we discuss multimodality

22
00:01:23,000 --> 00:01:26,247
in the context of real-time intralingual subtitling,

23
00:01:26,574 --> 00:01:31,281
and we also talk about the intricacies of conveying information through a mode

24
00:01:31,381 --> 00:01:33,055
that is not the original one.

25
00:01:34,152 --> 00:01:39,600
The concept of multimodality seems easy to understand at first glance.

26
00:01:40,489 --> 00:01:45,083
In audiovisual translation, scholars such as Jorge Díaz Cintas

27
00:01:45,249 --> 00:01:48,092
classify modes into 2 categories:

28
00:01:48,228 --> 00:01:49,589
audio and video.

29
00:01:50,187 --> 00:01:55,351
Modes help us to classify the way in which a specific resource is realised.

30
00:01:55,984 --> 00:01:59,507
In multimodal communication, resources are realised

31
00:01:59,611 --> 00:02:01,533
either visually or aurally.

32
00:02:03,317 --> 00:02:10,134
Sociolinguists and semiotic scholars, such as Halliday, Kress or Van Leeuwen,

33
00:02:10,355 --> 00:02:16,084
explain that there are many different types of resources within a culture.

34
00:02:16,607 --> 00:02:19,681
These resources can be verbal, such as language,

35
00:02:19,766 --> 00:02:24,071
or non-verbal, such as gestures, images, sounds, or objects,

36
00:02:24,824 --> 00:02:27,744
for example clothes or food.

37
00:02:28,980 --> 00:02:33,759
Depending on the type of resource, a speaker can choose the video

38
00:02:33,853 --> 00:02:36,233
or the audio mode for its realisation.

39
00:02:36,363 --> 00:02:40,687
For instance, words are a resource that can be realised aurally,

40
00:02:41,105 --> 00:02:42,435
through the audio mode,

41
00:02:43,422 --> 00:02:47,523
and visually, by using subtitles, for example.

42
00:02:49,838 --> 00:02:53,120
When a message is rendered multimodally,

43
00:02:53,200 --> 00:02:56,457
the audience needs to access both channels

44
00:02:56,683 --> 00:02:59,266
to receive the complete message.

45
00:03:00,000 --> 00:03:02,213
However, this is not always the case.
46
00:03:03,412 --> 00:03:08,268
The reasons why one channel may not be available are manifold,

47
00:03:08,320 --> 00:03:12,867
and can range from a noisy environment to hearing loss.

48
00:03:14,269 --> 00:03:17,644
In such cases, alternatives need to be available.

49
00:03:18,551 --> 00:03:22,433
This is the essence of the work of audiovisual translators

50
00:03:22,611 --> 00:03:25,097
and the purpose of access services.

51
00:03:25,193 --> 00:03:29,042
That is, to provide an alternative way to access the information

52
00:03:29,120 --> 00:03:32,980
that is not reaching the audience through the original channel.

53
00:03:35,592 --> 00:03:41,224
Our job is to enable a diamesic change from one mode to another,

54
00:03:41,353 --> 00:03:43,880
which has been described by Carlo Eugeni

55
00:03:43,984 --> 00:03:45,957
as "diamesic translation".

56
00:03:46,996 --> 00:03:52,946
For instance, dialogues or narrations that are rendered aurally in the original

57
00:03:53,285 --> 00:03:56,658
can be conveyed visually using subtitles.

58
00:03:57,669 --> 00:04:03,240
In real-time subtitling, the subtitler generates this visual information

59
00:04:03,320 --> 00:04:04,901
that is then added

60
00:04:05,317 --> 00:04:09,311
to the original resources that were already rendered visually.

61
00:04:10,748 --> 00:04:13,027
This change from one mode to another

62
00:04:13,359 --> 00:04:16,560
includes words and other resources that are necessary

63
00:04:16,640 --> 00:04:18,274
to understand a message,

64
00:04:18,721 --> 00:04:19,546
for example, sounds,

65
00:04:19,768 --> 00:04:23,887
contextual information, and speaker identification.

66
00:04:24,471 --> 00:04:28,360
For instance, at a conference, subtitlers may render sounds,

67
00:04:28,440 --> 00:04:31,307
like "APPLAUSE" after a speech,

68
00:04:31,400 --> 00:04:34,215
or a sound to which a speaker may react,

69
00:04:34,730 --> 00:04:38,293
such as a siren from outside, or someone sneezing,

70
00:04:38,399 --> 00:04:40,795
or a loud bang in another room.

71
00:04:41,904 --> 00:04:46,593
This brings us to the challenges that real-time subtitlers face.

72
00:04:48,358 --> 00:04:50,359
The challenge of multimodality.

73
00:04:51,320 --> 00:04:54,589
The challenges that real-time subtitlers face

74
00:04:54,640 --> 00:04:57,814
in the process of rendering resources visually

75
00:04:57,923 --> 00:05:00,625
emerge from 3 main constraints.

76
00:05:01,293 --> 00:05:05,611
These are a limited amount of time and space for our subtitles,

77
00:05:05,694 --> 00:05:06,844
and latency.

78
00:05:07,133 --> 00:05:10,579
Latency refers to the maximum delay or time

79
00:05:10,672 --> 00:05:14,210
by which subtitles should appear on a screen.

80
00:05:16,545 --> 00:05:21,601
Subtitles should coincide as much as possible with speech onset.

81
00:05:22,088 --> 00:05:25,823
A minimal delay supports understanding and lip-reading,

82
00:05:26,102 --> 00:05:28,102
which is an additional input cue

83
00:05:28,188 --> 00:05:31,885
that persons with hearing loss often use in communication.

84
00:05:32,602 --> 00:05:37,220
Some examples of maximum delay in different contexts are:

85
00:05:37,691 --> 00:05:42,414
6 seconds for TV, 6 to 8 seconds in parliaments,

86
00:05:42,507 --> 00:05:45,141
and 3 seconds at conferences.

87
00:05:47,912 --> 00:05:52,865
These constraints of real-time situations have clear implications for subtitlers,

88
00:05:53,215 --> 00:05:56,888
who will continuously have to choose which resources to render.
89
00:05:57,864 --> 00:06:02,635
These choices are influenced by how well-organised a speaker is,

90
00:06:02,746 --> 00:06:05,477
by how fast he or she speaks,

91
00:06:05,741 --> 00:06:07,795
and by the working context.

92
00:06:08,507 --> 00:06:10,163
Let's see some examples.

93
00:06:11,586 --> 00:06:12,477
In parliaments,

94
00:06:12,570 --> 00:06:16,723
the most important features to be subtitled are, in this order:

95
00:06:17,774 --> 00:06:21,409
speech, which should be as verbatim as possible,

96
00:06:21,533 --> 00:06:25,762
and without features of orality, such as tone or stress.

97
00:06:26,889 --> 00:06:29,080
Then, speaker identification.

98
00:06:29,585 --> 00:06:33,638
This is especially important because words need to be attributed

99
00:06:33,680 --> 00:06:35,173
to the actual speaker.

100
00:06:35,346 --> 00:06:37,901
Otherwise, diplomatic incidents could occur.

101
00:06:38,805 --> 00:06:40,797
Then, contextual information,

102
00:06:40,887 --> 00:06:44,119
which becomes key when voting takes place.

103
00:06:44,620 --> 00:06:47,432
In voting cases, the other resources

104
00:06:47,542 --> 00:06:51,240
(speech, speaker identification, slides, etc.)

105
00:06:51,407 --> 00:06:53,022
are of less importance.

106
00:06:53,482 --> 00:06:55,788
Lastly, other materials.

107
00:06:56,073 --> 00:06:56,859
In parliaments,

108
00:06:57,029 --> 00:07:00,994
it rarely happens that somebody brings along materials

109
00:07:01,310 --> 00:07:03,434
such as pictures or slides.

110
00:07:04,145 --> 00:07:07,726
In most cases, this information is not relevant

111
00:07:07,800 --> 00:07:09,924
and will not be prioritised.

112
00:07:10,964 --> 00:07:13,393
Lastly, an example from conferences.

113
00:07:15,542 --> 00:07:18,514
At conferences, speech is also prioritised,

114
00:07:18,624 --> 00:07:20,922
as it is in parliaments.

115
00:07:21,809 --> 00:07:25,259
Identifying a speaker is often less important at conferences

116
00:07:25,359 --> 00:07:28,190
because it is usually quite clear who is speaking,

117
00:07:28,436 --> 00:07:31,738
especially when only one speaker is on stage.

118
00:07:32,362 --> 00:07:35,990
However, identifying a speaker may be relevant

119
00:07:36,000 --> 00:07:39,609
when there is a debate and speakers start to switch.

120
00:07:40,149 --> 00:07:41,000
In these cases,

121
00:07:41,080 --> 00:07:43,874
identifying the speaker becomes more critical,

122
00:07:44,015 --> 00:07:49,155
as subtitlers will have to pay more attention to mentioning the names.

123
00:07:50,452 --> 00:07:54,704
Another case of speaker identification at conferences would be

124
00:07:54,920 --> 00:07:58,437
when interpreters say something on their own behalf.

125
00:07:59,753 --> 00:08:03,468
For example, a simultaneous interpreter may say:

126
00:08:03,712 --> 00:08:05,733
"I cannot hear the speaker".

127
00:08:05,895 --> 00:08:08,875
Or "the microphone is shut off".

128
00:08:09,678 --> 00:08:12,404
In such cases, it is a small challenge

129
00:08:12,509 --> 00:08:16,558
to show very clearly in your text, as a subtitler,

130
00:08:16,897 --> 00:08:19,560
that this is something that the interpreter says,

131
00:08:19,640 --> 00:08:21,869
and not the original speaker.

132
00:08:23,824 --> 00:08:28,920
Sounds, like applause, are often included in subtitles at conferences,

133
00:08:29,143 --> 00:08:32,957
whereas contextual information such as "irony" is less common

134
00:08:33,000 --> 00:08:35,067
because the interaction is live.

135
00:08:36,080 --> 00:08:38,367
OK, let's recap now.
136
00:08:39,596 --> 00:08:43,910
Multimodality makes communication exciting and complex

137
00:08:43,960 --> 00:08:45,392
at the same time.

138
00:08:45,748 --> 00:08:50,545
Moreover, multimodality often requires a higher effort from both

139
00:08:50,730 --> 00:08:52,855
viewers and subtitlers.

140
00:08:53,475 --> 00:08:54,428
On the one hand,

141
00:08:54,543 --> 00:08:58,241
viewers or end-users will perceive more information

142
00:08:58,361 --> 00:08:59,614
through the visual mode,

143
00:09:00,369 --> 00:09:03,401
and at a pace that is set by the speaker.

144
00:09:04,565 --> 00:09:08,379
On the other hand, subtitlers continuously have to make choices

145
00:09:08,440 --> 00:09:11,861
about which resources should be rendered and when.

146
00:09:13,042 --> 00:09:14,586
Depending on the context,

147
00:09:14,666 --> 00:09:18,874
this will mean adding information or, conversely, reducing

148
00:09:18,920 --> 00:09:23,228
or condensing the message to provide subtitles in synchrony

149
00:09:23,310 --> 00:09:26,312
with the speech onset and with a minimal delay.

150
00:09:27,313 --> 00:09:31,504
You will learn how to do this in Unit 5 and Unit 6

151
00:09:31,616 --> 00:09:36,470
with our colleagues Wim Gerbecks, Carlo Eugeni and Silvia Velardi.

152
00:09:37,050 --> 00:09:38,815
For now, I say goodbye.

153
00:09:39,592 --> 00:09:40,802
Exercises.

154
00:09:41,804 --> 00:09:45,791
The exercises for this video lecture are in the Trainer’s Guide

155
00:09:45,890 --> 00:09:46,856
for Unit 1

156
00:09:46,944 --> 00:09:49,174
and in the PowerPoint presentation.

157
00:10:00,364 --> 00:10:02,806
LTA - LiveTextAccess.

158
00:10:03,531 --> 00:10:06,081
Universitat Autònoma de Barcelona.

159
00:10:07,152 --> 00:10:10,319
SDI - Internationale Hochschule.

160
00:10:11,454 --> 00:10:15,060
Scuola Superiore per Mediatori Linguistici.

161
00:10:16,109 --> 00:10:17,699
2DFDigital.

162
00:10:18,793 --> 00:10:22,147
The European Federation of Hard of Hearing People – EFHOH.

163
00:10:23,249 --> 00:10:24,397
VELOTYPE.

164
00:10:25,177 --> 00:10:26,548
SUB-TI ACCESS.

165
00:10:27,551 --> 00:10:32,661
European Certification and Qualification Association – ECQA.

166
00:10:35,886 --> 00:10:39,900
Co-funded by the Erasmus+ Programme of the European Union.

167
00:10:41,904 --> 00:10:55,960
Erasmus+ Project: 2018-1-DE01-KA203-004218.

168
00:10:57,240 --> 00:11:00,600
The information and views set out in this presentation

169
00:11:00,960 --> 00:11:02,763
are those of the authors

170
00:11:02,920 --> 00:11:06,480
and do not necessarily reflect the official opinion

171
00:11:06,800 --> 00:11:08,120
of the European Union.

172
00:11:09,240 --> 00:11:12,880
Neither the European Union institutions and bodies

173
00:11:13,440 --> 00:11:16,040
nor any person acting on their behalf

174
00:11:16,640 --> 00:11:19,320
may be held responsible for the use

175
00:11:19,680 --> 00:11:23,000
which may be made of the information contained herein.